Degenerate Quantum Codes for Pauli Channels 
A striking feature of quantum error correcting codes is that they can sometimes be used to 
correct more errors than they can uniquely identify. Such degenerate codes have long been known, 
but have remained poorly understood. We provide a heuristic for designing degenerate quantum 
codes for high noise rates, which is applied to generate codes that can be used to communicate over 
almost any Pauli channel at rates that are impossible for a nondegenerate code. The gap between 
nondegenerate and degenerate code performance is quite large, in contrast to the tiny magnitude 
of the only previous demonstration of this effect. We also identify a channel for which none of our 
codes outperform the best nondegenerate code and show that it is nevertheless quite unlike any 
channel for which nondegenerate codes are known to be optimal. 



It was Shannon [1] who discovered, by a random coding
argument, the beautiful fact that the capacity of a noisy
channel $\mathcal{N}$ is equal to the maximal mutual information
between an input variable, $X$, and its image under the
action of the channel:

$$C = \max_X I(X; \mathcal{N}(X)). \tag{1}$$



It is remarkable that this maximization is over a single 
input to the channel; it does not require consideration of 
inputs correlated over many channel uses. 

One would hope that, as in the classical case, there is 
some measure of quantum correlations that can be max- 
imized over inputs to a quantum channel to give the ca- 
pacity. Unfortunately, this appears not to be the case. 
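The natural generalization of Eq. (1) is to replace the
random variable $X$ with a quantum state $\rho$ and the mutual
information with the coherent information $I^c$, giving

$$Q_1 = \max_{\rho} I^c(\mathcal{N}, \rho), \tag{2}$$

where

$$I^c(\mathcal{N}, \rho) = I^c\big((I \otimes \mathcal{N})(|\phi_{AB}\rangle\langle\phi_{AB}|)\big). \tag{3}$$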



Here $|\phi_{AB}\rangle$ is a purification of $\rho$. Its use reflects the
fact that, unlike in the classical case, there can be no
remaining copy of the channel input with which to compare
correlations; instead we must consider the quantum
state as a whole. The coherent information is defined by
$I^c(\rho_{AB}) = S(\rho_B) - S(\rho_{AB})$ with $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$.

While we can achieve $Q_1$ using a random code on the
typical subspace of the maximizing $\rho$, it has long been
known that this rate is not always optimal [2, 3]. Those works
exhibit codes with rates larger than $Q_1$ for very noisy
depolarizing channels, for which $Q_1$ is small or even zero.
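As a point of reference for the noise levels discussed in this paper, the following is a minimal sketch of the nondegenerate baseline. It assumes the standard hashing-rate expression $1 - H(\mathbf{p})$ for a Pauli channel with error probabilities $(p_x, p_y, p_z)$, i.e. the coherent information evaluated on half of a maximally entangled pair. The function names are illustrative and do not come from the text.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(q * math.log2(q) for q in probs if q > 0.0)

def hashing_rate(px, py, pz):
    """Hashing rate 1 - H(p_I, p_x, p_y, p_z) of a Pauli channel."""
    return 1.0 - shannon_entropy([1.0 - px - py - pz, px, py, pz])

def depolarizing_hashing_threshold(tol=1e-10):
    """Bisection for the total error probability p at which the hashing rate
    of the depolarizing channel (p_x = p_y = p_z = p/3) reaches zero."""
    lo, hi = 0.0, 0.5          # the rate is positive at lo and negative at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hashing_rate(mid / 3, mid / 3, mid / 3) > 0.0:
            lo = mid
        else:
            hi = mid
    return lo

if __name__ == "__main__":
    print(depolarizing_hashing_threshold())
```

The threshold found this way, close to 0.1893, is the hashing point against which the degenerate-code thresholds reported later in the paper are compared.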

The correct quantum capacity formula is not $Q_1$, but
instead is given by [4, 5, 6]

$$Q = \lim_{n \to \infty} \frac{1}{n} \max_{\rho_n} I^c(\mathcal{N}^{\otimes n}, \rho_n), \tag{4}$$



where taking the limit $n \to \infty$ means that we must consider
the behavior of the channel on inputs entangled
across many uses. This multi-letter formula is an ex- 
pression of our ignorance about the structure of capacity 
achieving codes for a quantum channel. 

The difference between these single- and multi-letter 
formulas is intimately related to the existence of degen- 
erate quantum codes. Strictly speaking, degeneracy is 
not a property of a quantum code alone, but a property 
of a code together with a family of errors it is designed 
to correct. More formally, one usually says that a code 
$C$ degenerately corrects a set of errors $\mathcal{E}$ if in addition
to correcting $\mathcal{E}$, there are multiple errors in $\mathcal{E}$ that are
mapped to the same error syndrome. In the context of
probabilistic noise, which we will be concerned with exclusively,
we say that a code $C$ degenerately corrects the
noise due to a channel $\mathcal{N}$ if it can be decoded with a high
fidelity and furthermore multiple errors in the set of typical
errors of $\mathcal{N}$ are mapped to the same error syndrome.
For the most part, we will be concerned with grossly de- 
generate codes, which have the further property that the 
number of typical errors mapped to each syndrome is 
exponential in the code's block-length. 

For the depolarizing channel considered in [2, 3], as
well as the Pauli channels considered below, $Q_1$ is exactly
the maximum rate achievable with a nondegenerate code.
That $Q > Q_1$ is then established by the construction of a
massively degenerate code. While this was accomplished
in the work of [2, 3], the difference found was over a
minuscule range of noise parameters and extremely small
in magnitude. As a result, one may have thought that
Eq. (2) is "essentially correct", with some minor
modifications in the very noisy regime. In the decade since the
appearance of these two works, there has been almost
no progress on understanding the difference between the
single- and multi-letter expressions above, a failure which
has to some extent been tempered by the hope that the
smallness of the effect would make it amenable to a
perturbative analysis. We will show that this cannot be the
case and in fact that the smallness of the effect found in
[2, 3] is more accidental than fundamental.

Until now, very little has been understood about why
the degenerate codes of [2, 3] work, besides that they
seem to be highly degenerate. The main contribution of
this paper is to provide a conceptual explanation of why
degenerate codes of this type work, along with a related
heuristic for designing codes for more general channels.
Using this heuristic, we find better codes for almost all
Pauli channels, and exhibit cases where the effect of
degeneracy can be quite large. This large gap between the
performance of nondegenerate and degenerate codes
implies that a perturbative approach to $Q$ is unlikely to be
useful.

A secondary contribution we believe to be no less
important, but which lies on the periphery of the current
work, is the identification of the two-Pauli channel as an
important piece of the degenerate coding puzzle. This
channel derives no benefit from the degenerate codes we
study, but is also quite different from any of the
degradable channels, a set of channels including the dephasing
and erasure channels, and comprising the only channels
for which nondegenerate codes are known to be
optimal [8]. Therefore either there is some other sort of
degenerate code that will beat $Q_1$ or $Q_1$ can be optimal for
nondegradable channels. Either outcome seems plausible,
and progress on resolving this dichotomy would necessarily
deepen our understanding of the quantum coding
problem in general. Along a similar line, we have introduced
a general method for showing that a channel is not
degradable, taking one of the first steps in the program
to classify all degradable channels.

Cat Codes for Pauli Channels. The codes we will consider
are $m$-qubit repetition codes, sometimes called
"cat codes" because the code space is spanned by $|0\rangle^{\otimes m}$
and $|1\rangle^{\otimes m}$. These have stabilizers $Z_1Z_2, Z_1Z_3, \ldots, Z_1Z_m$
and logical operators $\bar{X} = X^{\otimes m}$ and $\bar{Z} = Z_1$, so
that an error of the form $X^{\mathbf{u}}Z^{\mathbf{v}}$ leads to syndrome
$\{u_1 \oplus u_2, \ldots, u_1 \oplus u_m\}$ and in the absence of a recovery
operation gives a logical error of $\bar{X}^{u_1}\bar{Z}^{\oplus_l v_l}$. By encoding
half of $|\phi^+\rangle_{AB} = (|00\rangle + |11\rangle)/\sqrt{2}$ in our repetition code,
we get a state for which the coherent information in
Eq. (4) will be more than $m$ times $Q_1$. Sending the $B$
system of the resulting encoded state through $\mathcal{N}_{\mathbf{p}}^{\otimes m}$ and
subsequently measuring the stabilizers $\{Z_1Z_l\}_{l=2}^{m}$ leads to the
state

$$\rho_{AB_m} = \sum_{\mathbf{r} \in \{0,1\}^{m-1}} \Pr(\mathbf{r})\, (I \otimes \mathcal{N}^{\mathbf{r}})(|\phi^+\rangle\langle\phi^+|) \otimes |\mathbf{r}\rangle\langle\mathbf{r}|,$$

where $\mathbf{r}$ is the syndrome measured, $\mathcal{N}^{\mathbf{r}}$ is the induced
channel given $\mathbf{r}$ (which is also a Pauli channel), and $\Pr(\mathbf{r})$
is the probability of measuring $\mathbf{r}$. Concatenating our
repetition code with a random stabilizer code allows
communication with high fidelity at a rate of

$$\frac{1}{m} I^c(\rho_{AB_m}) = \frac{1}{m} \sum_{\mathbf{r}} \Pr(\mathbf{r})\, I^c\big((I \otimes \mathcal{N}^{\mathbf{r}})(|\phi^+\rangle\langle\phi^+|)\big). \tag{5}$$
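The syndrome and logical-error bookkeeping for the repetition code can be illustrated by brute force. The sketch below is only an illustration under the conventions stated in this section (stabilizers $Z_1Z_l$, logical operators $X^{\otimes m}$ and $Z_1$). It represents a Pauli error $X^{\mathbf{u}}Z^{\mathbf{v}}$ by two bit tuples and groups all $4^m$ errors by syndrome and logical action. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

def syndrome_and_logical(u, v):
    """Syndrome and uncorrected logical action of the error X^u Z^v.

    u, v are tuples of m bits. The syndrome is (u_1 xor u_l) for l = 2..m;
    the logical error is X^{u_1} Z^{parity(v)}.
    """
    syndrome = tuple(u[0] ^ ul for ul in u[1:])
    logical = (u[0], sum(v) % 2)
    return syndrome, logical

def classify_errors(m):
    """Count how many Pauli errors share each (syndrome, logical action) pair."""
    counts = Counter()
    for u in product((0, 1), repeat=m):
        for v in product((0, 1), repeat=m):
            counts[syndrome_and_logical(u, v)] += 1
    return counts

if __name__ == "__main__":
    classes = classify_errors(3)
    print(len(classes), set(classes.values()))
```

For $m = 3$ the 64 Pauli errors fall into 16 classes of 4 errors each: every class is a coset of the stabilizer group, whose elements act identically on the code space. This is the degeneracy that the rate expression exploits.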



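Because the repetition code is highly symmetric we
can find explicit formulas for both $\Pr(\mathbf{r})$ and $\mathcal{N}^{\mathbf{r}}$, and
thus a fairly compact expression for $I^c(\rho_{AB_m})$. Writing
$p_I = 1 - p_x - p_y - p_z$, the joint
probabilities of logical errors and syndrome outcomes are

$$\Pr(\bar{X}^{u}\bar{Z}^{v}, \mathbf{r}) = \frac{1}{2}\Big[(p_I + p_z)^{m-s}(p_x + p_y)^{s} + (-1)^{v}(p_I - p_z)^{m-s}(p_x - p_y)^{s}\Big], \qquad s = \begin{cases} r, & u = 0, \\ m - r, & u = 1, \end{cases} \tag{6}$$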

where $r = |\mathbf{r}|$, the Hamming weight of $\mathbf{r}$. Eq. (6) allows
us to find both $\Pr(\mathbf{r})$ and the error probabilities of the
induced channels $\mathcal{N}^{\mathbf{r}}$. This formula depends on $r$ but has
no other dependence on $\mathbf{r}$, which dramatically reduces the
number of induced channels that need to be considered.
By evaluating Eq. (5) on the probabilities of Eq. (6), we find
that for almost all Pauli channels there is a repetition
code with nonzero rate at the hashing point. When $p_x \gg p_z$
the best code is in the $Z$ basis with length scaling
like $1/p_z$, which we'll study in detail in the next section.
For $p_x \geq p_z \geq p_y$ it is a good rule of thumb to use a
$Z$ repetition code of length $m \approx 1/p_z$, with the largest
increase in rate for fairly asymmetrical channels (Fig. 1).
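The rate of a $Z$-basis cat code concatenated with a random stabilizer code can be evaluated numerically along these lines. The sketch below is not the authors' code. It assumes that the joint probability of a logical error $\bar{X}^u\bar{Z}^v$ and a particular syndrome of weight $r$ is $\tfrac{1}{2}[(p_I+p_z)^{m-s}(p_x+p_y)^{s} + (-1)^v (p_I-p_z)^{m-s}(p_x-p_y)^{s}]$, with $s = r$ for $u = 0$ and $s = m - r$ for $u = 1$, a form derived here directly from the stabilizer description. It also uses the fact that the coherent information of a Pauli channel on half of $|\phi^+\rangle$ is one minus the entropy of its error distribution. All function names are illustrative.

```python
from math import comb, log2

def joint_prob(m, r, u, v, px, py, pz):
    """Probability of logical error X^u Z^v together with ONE syndrome of weight r."""
    pi = 1.0 - px - py - pz
    s = r if u == 0 else m - r           # number of qubits carrying an X component
    plus = (pi + pz) ** (m - s) * (px + py) ** s
    minus = (pi - pz) ** (m - s) * (px - py) ** s
    return 0.5 * (plus + (-1) ** v * minus)

def cat_code_rate(m, px, py, pz):
    """(1/m) * sum over syndromes of Pr(r) * (1 - entropy of the induced channel)."""
    total = 0.0
    for r in range(m):                   # comb(m - 1, r) syndromes share each weight r
        joint = [joint_prob(m, r, u, v, px, py, pz) for u in (0, 1) for v in (0, 1)]
        pr = sum(joint)
        if pr <= 0.0:
            continue
        entropy = -sum(q / pr * log2(q / pr) for q in joint if q > 0.0)
        total += comb(m - 1, r) * pr * (1.0 - entropy)
    return total / m

if __name__ == "__main__":
    p = 0.19                             # depolarizing channel, slightly above the hashing point
    for m in (1, 3, 5):
        print(m, cat_code_rate(m, p / 3, p / 3, p / 3))
```

With $m = 1$ the function reduces to the hashing rate, which gives a quick consistency check. Scanning $m$ at fixed noise parameters is one way to look for the best repetition length.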

Repetition Lengths for Almost Bitflip Channels. — To il- 
lustrate the tradeoff determining the best repetition code 
length, we will study their performance for channels with 
independent phase and amplitude error probabilities. An
error $X^uZ^v$ is said to be a phase error if $v = 1$ and an
amplitude error if $u = 1$ (note that when $u = 1$ and
$v = 1$ it is both). Throughout, we define $q_x = p_x + p_y$ and
$q_z = p_z + p_y$ to be the amplitude and phase error probabilities,
respectively, and in a slight abuse of terminology
refer to amplitude and phase errors as $X$ and $Z$ errors,
with a $Y$ error being "both $X$ and $Z$." Independence
of phase and amplitude errors requires $p_x = q_x(1 - q_z)$,
$p_y = q_xq_z$, and $p_z = q_z(1 - q_x)$. When $q_x > q_z$ we find
that the repetition code with the best zero-rate noise
threshold has $m \approx 1/q_z$, which can be understood by
considering the effective channels induced by the code.

The independence of phase and amplitude, together
with our generators involving only $Z$'s, tells us that the
probability of a logical phase error is independent of
$\mathbf{r}$, and given by $\bar{q}_z = \Pr(\oplus_l v_l = 1) = [1 - (1 - 2q_z)^m]/2$,
which also follows from Eq. (6).
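Both sides of the tradeoff in repetition length can be checked numerically for the independent-error model. The sketch below is a minimal illustration with hypothetical function names. It compares the closed form for the logical phase error probability with a direct sum over odd-weight phase error patterns. It also computes the posterior probability of a logical amplitude error given the syndrome weight $r$, assuming that for $u_1 = 0$ a weight-$r$ syndrome means $r$ amplitude errors on qubits $2, \ldots, m$, while for $u_1 = 1$ it means $m - 1 - r$ of them.

```python
from math import comb

def logical_phase_error(m, qz):
    """Closed form [1 - (1 - 2 q_z)^m] / 2 for an odd number of phase errors."""
    return (1.0 - (1.0 - 2.0 * qz) ** m) / 2.0

def logical_phase_error_bruteforce(m, qz):
    """Same quantity, summing the binomial distribution over odd weights."""
    return sum(comb(m, k) * qz ** k * (1.0 - qz) ** (m - k) for k in range(1, m + 1, 2))

def logical_x_posterior(m, qx, r):
    """Pr(qubit 1 has an amplitude error | syndrome weight r), independent errors."""
    no_flip = (1.0 - qx) * qx ** r * (1.0 - qx) ** (m - 1 - r)   # u_1 = 0
    flip = qx * (1.0 - qx) ** r * qx ** (m - 1 - r)              # u_1 = 1
    return flip / (flip + no_flip)

if __name__ == "__main__":
    m, qx, qz = 33, 0.27, 0.03           # illustrative values, not taken from the text
    print(logical_phase_error(m, qz))
    print([logical_x_posterior(m, qx, r) for r in (9, 16, 23)])
```

As $m$ grows, the posterior becomes sharply peaked for syndrome weights near $(m-1)q_x$ or $(m-1)(1-q_x)$, while the logical phase error probability drifts towards $1/2$.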

As we have already seen, the probability of a logical
amplitude error depends only on $r = |\mathbf{r}|$, not on $\mathbf{r}$ itself.
If $m$ is large, the probability distribution of $r$ becomes
concentrated near $r_0 = (m-1)q_x$ and $r_1 = (m-1)(1-q_x)$.
This is because there are typically $(m-1)q_x$ $X$ errors on
qubits 2 through $m$ and the syndrome bits of these qubits all get
flipped if qubit 1 has an $X$ error. So, the measured value of $r$ tells
us whether or not a logical $X$ error has occurred, at least
with high probability. One can see from this, together
with the $\bar{q}_z$ above, that as $m$ increases we learn more
about the logical $X$ error at the expense of knowing less
about the logical $Z$.

Indeed, the optimal repetition length will minimize the
entropy in the logical qubits conditioned on $\mathbf{r}$, which near
the hashing point occurs when the repetition length is
around $1/q_z$, at which point almost all of the $X$ entropy
has been removed. If we increase $m$ beyond this the
gain in information about the logical $X$ is less than the
resulting decrease in our knowledge of the logical $Z$'s.
The overall rate thus achieved at the hashing point scales
like $q_z \ln(1/q_z)/\ln\ln(1/q_z)$.

Note that essentially all of the entropy in the $X$ errors
is removed by the best code, with the optimal length
determined by a tradeoff between the reduction of entropy
in the $X$ errors and the increase of entropy in the $Z$
errors. This sort of tradeoff also determines the optimal
repetition code length for a general Pauli channel.

FIG. 1: Best Z-cat code rates for independent phase and
amplitude errors with $q_z/q_x = (p_z + p_y)/(p_x + p_y) = 9$ (and
where $p = p_x + p_y + p_z$). The optimal $m$ increases with $p$.
$m = 33$ gives the best threshold of $\approx 0.295$, compared to a
hashing threshold less than 0.274. The rule of thumb $m \approx 1/p_z$
gives an estimated $m = 36$, not far from the optimum.

Concatenated Repetition Codes. We can immediately
apply this analysis to design even better codes by using
concatenation. By adapting a second level of repetition
code to the error probabilities of the channels induced
by the first level we can exceed the performance of any
single level cat code. We have used this approach for the
depolarizing channel with the results shown in Fig. 2,
where we plot the probabilities at which the rate of a
concatenated 3 in $m$ and 5 in $m$ code goes to zero as a
function of $m$, the size of the outer cat code. If we first
use a 3-cat code in the $Z$ basis, followed by an $m$-cat
code in the $X$ basis, we find the highest threshold for
a 3 in 19 code, with a nonzero rate up to $p \approx 0.19086$,
surpassing the codes of [3]. Starting with a 5-cat code
the threshold increases up to $p \approx 0.19088$ for $m = 16$, the
best known code for this channel, but for higher values
of $m$ the computation of this probability is quite slow.
Based on the character of the channels induced by the
inner repetition code, together with the behavior for $m < 16$,
we expect that the threshold increases until something
like 5 in 25, at which point a larger $m$ begins to reduce
the effectiveness of the code.

FIG. 2: Error probability where the rate goes to zero, as a
function of length of second level cat code. Here the horizontal
axis is $m$, the length of the second level (outer) cat code. The bottom
line is hashing, the middle line is a 5 qubit repetition code,
the upper line is a concatenated 5 in 5 repetition code. The
lower curve is a 3 in $m$ repetition code; the upper is 5 in $m$.



Two-Pauli Channels are Special. — Besides the one-Pauli 
channels, the only channels for which we can find no code 
offering an advantage near the hashing point are tightly 
concentrated near $\mathcal{N}_p^{2P}(\rho) = (1 - p)\rho + \frac{p}{2}X\rho X + \frac{p}{2}Z\rho Z$.
While hashing is optimal for one-Pauli channels [?], $\mathcal{N}_p^{2P}$
is not known to have additive coherent information,
which is equivalent to the optimality of hashing.
Furthermore, we will show that unlike all channels known to
be additive this channel is not degradable [8].

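Every channel $\mathcal{N}$ can be expressed as an isometry
followed by a partial trace, which is to say there is an
isometry $U_{\mathcal{N}} : A \to BE$ such that $\mathcal{N}(\rho) = \mathrm{Tr}_E\, U_{\mathcal{N}} \rho U_{\mathcal{N}}^\dagger$. The
complementary channel of $\mathcal{N}$, called $\mathcal{N}^c$, results by tracing
out system $B$ rather than $E$: $\mathcal{N}^c(\rho) = \mathrm{Tr}_B\, U_{\mathcal{N}} \rho U_{\mathcal{N}}^\dagger$.
A channel is called degradable if there is a completely
positive map, $\mathcal{D} : B \to E$, which "degrades"
$\mathcal{N}$ to $\mathcal{N}^c$, so that $\mathcal{D} \circ \mathcal{N} = \mathcal{N}^c$. The
existence of such a map immediately implies the
additivity of $I^c$ [8], which can be seen by noting that

$$I^c(\mathcal{N}^{\otimes (n_1 + n_2)}, \rho_{n_1 n_2}) \leq I^c(\mathcal{N}^{\otimes n_1}, \rho_{n_1}) + I^c(\mathcal{N}^{\otimes n_2}, \rho_{n_2})$$

exactly when $I(E_{n_1}; E_{n_2}) \leq I(B_{n_1}; B_{n_2})$ and recalling that
$I(B_{n_1}; B_{n_2})$ cannot increase under local operations. We
now show there is no such $\mathcal{D}$ for $\mathcal{N}_p^{2P}$ when $0 < p < 1$.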
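Letting $\mathcal{N}_p^{2P}(|i\rangle\langle j|) = \sum_{kl} N_{ij;kl} |k\rangle\langle l|$ define $N$, and
defining $N^c$ in the same way from the complementary channel,
we find

$$N = \begin{pmatrix} 1 - p/2 & 0 & 0 & p/2 \\ 0 & 1 - 3p/2 & p/2 & 0 \\ 0 & p/2 & 1 - 3p/2 & 0 \\ p/2 & 0 & 0 & 1 - p/2 \end{pmatrix}$$

and

$$N^c = \begin{pmatrix} p/2 & 0 & 0 & 0 & p/2 & a & 0 & a & 1 - p \\ 0 & -p/2 & a & p/2 & 0 & 0 & a & 0 & 0 \\ 0 & p/2 & a & -p/2 & 0 & 0 & a & 0 & 0 \\ p/2 & 0 & 0 & 0 & p/2 & -a & 0 & -a & 1 - p \end{pmatrix},$$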



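where $a = \sqrt{p(1 - p)/2}$. If $\mathcal{N}_p^{2P}$ is degradable, there
must be a CPTP map $\mathcal{D}$ such that $\mathcal{D} \circ \mathcal{N} = \mathcal{N}^c$, which
is equivalent to $ND = N^c$, with $D$ defined by
$\mathcal{D}(|s\rangle\langle t|) = \sum_{uv} D_{st;uv} |u\rangle\langle v|$.
For $N$ and $N^c$ as above, this gives

$$D = \begin{pmatrix} p/2 & 0 & 0 & 0 & p/2 & \beta & 0 & \beta & 1 - p \\ 0 & -\gamma & \beta & \gamma & 0 & 0 & \beta & 0 & 0 \\ 0 & \gamma & \beta & -\gamma & 0 & 0 & \beta & 0 & 0 \\ p/2 & 0 & 0 & 0 & p/2 & -\beta & 0 & -\beta & 1 - p \end{pmatrix}$$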



with $\beta = \sqrt{p/(2 - 2p)}$ and $\gamma = p/(2 - 4p)$. The Choi
matrix [?] of $\mathcal{D}$, $C_{ik;jl} = D_{ij;kl}$, thus contains the subblock

$$\begin{pmatrix} p/2 & -\gamma \\ -\gamma & p/2 \end{pmatrix}.$$

This has a negative eigenvalue for all $0 < p < 1$, so that $C$ cannot be
nonnegative and thus $\mathcal{D}$ is not CP.
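The nondegradability argument can be checked numerically. The following sketch uses only the Python standard library and rests on stated assumptions: the Kraus operators $\sqrt{p/2}\,X$, $\sqrt{p/2}\,Z$, $\sqrt{1-p}\,I$ for the two-Pauli channel, an environment basis ordered $(X, Z, I)$ (a labeling choice made here), and row index $2i + j$ for the input operator $|i\rangle\langle j|$. It solves $ND = N^c$ by Gauss-Jordan elimination and returns the smallest eigenvalue of one $2 \times 2$ principal subblock of the Choi matrix of $D$. A negative value certifies that the Choi matrix is not positive semidefinite. All names are illustrative.

```python
import math

def two_pauli_kraus(p):
    """Kraus operators of rho -> (1-p) rho + (p/2) X rho X + (p/2) Z rho Z,
    in the (assumed) environment order X, Z, I."""
    a, b = math.sqrt(p / 2), math.sqrt(1 - p)
    return [[[0.0, a], [a, 0.0]],        # sqrt(p/2) X
            [[a, 0.0], [0.0, -a]],       # sqrt(p/2) Z
            [[b, 0.0], [0.0, b]]]        # sqrt(1-p) I

def channel_matrices(p):
    """Return N (4x4) and N^c (4x9); row 2*i + j corresponds to the input |i><j|."""
    K = two_pauli_kraus(p)
    N = [[0.0] * 4 for _ in range(4)]
    Nc = [[0.0] * 9 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            row = 2 * i + j
            for k in range(2):
                for l in range(2):
                    # coefficient of |k><l| in sum_e K_e |i><j| K_e^T (all K_e are real)
                    N[row][2 * k + l] = sum(Ke[k][i] * Ke[l][j] for Ke in K)
            for e in range(3):
                for f in range(3):
                    # coefficient of |e><f| in the complementary channel output
                    Nc[row][3 * e + f] = sum(K[e][k][i] * K[f][k][j] for k in range(2))
    return N, Nc

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[r]) + list(B[r]) for r in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def choi_subblock_min_eigenvalue(p):
    """Smallest eigenvalue of the Choi subblock on basis vectors (0,X) and (1,Z)."""
    N, Nc = channel_matrices(p)
    D = solve(N, Nc)                     # requires N invertible, i.e. p != 1/2 and p != 1
    X, Z = 0, 1                          # environment labels in the assumed order
    c00 = D[0][3 * X + X]                # C[(0,X),(0,X)] = D[00][XX]
    c01 = D[1][3 * X + Z]                # C[(0,X),(1,Z)] = D[01][XZ]
    c11 = D[3][3 * Z + Z]                # C[(1,Z),(1,Z)] = D[11][ZZ]
    mean, half_diff = (c00 + c11) / 2, (c00 - c11) / 2
    return mean - math.hypot(half_diff, c01)

if __name__ == "__main__":
    for p in (0.1, 0.3, 0.7, 0.9):
        print(p, choi_subblock_min_eigenvalue(p))
```

At $p = 1/2$ the matrix $N$ is singular, so the sketch should not be evaluated there. Away from that point the subblock eigenvalue is $p/2 - |\gamma|$, which is negative on both sides of $p = 1/2$.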

Besides repetition codes, we have explored concatenated
repetition codes for $\mathcal{N}_p^{2P}$, all of which performed
worse than the hashing rate of $1 - H(p) - p$. This
suggests the capacity of $\mathcal{N}_p^{2P}$ is exactly $1 - H(p) - p$, and in
light of its nondegradability we hope a proof of this
conjecture will point towards a new sufficient criterion for
the additivity of coherent information.
Discussion. — It is tempting to ask if there is a simpler 
characterization of the quantum channel capacity than 
is provided by Eq. (4). In particular, contrary to what
is sometimes claimed, the results of [2, 3] and this work
do not rule out a single letter formula for the capacity — 
what is ruled out is the possibility that the single letter 
optimized coherent information is the correct formula. 
It could be that there is a single letter formula for the 
capacity, or less ambitiously simply an efficiently calcula- 
ble expression, which takes degeneracy into account. The 



characterization of capacity in terms of coherent informa- 
tion is fundamentally nondegenerate, and it may be this 
which leads to the necessity of regularization, rather than 
an inherent superadditivity of quantum information. 

More concretely, the two-Pauli channel with equal 
probabilities seems to be somehow different from other 
Pauli channels. Given their success with almost all other 
Pauli channels, the failure of cat codes to beat $Q_1$ in
this case suggests that hashing is optimal. Resolving this 
conjecture seems to be a manageable problem whose so- 
lution may lead to a better understanding of additivity 
questions for quantum channels in general. 

The ideas explored here are also useful for quantum 
key distribution. In particular, using highly structured 
codes for information reconciliation improves the noise 
threshold of BB84 with one-way classical post-processing 
from 12.4% to 12.9% [?].

Finally, we hope the coding approach suggested by the 
almost bitflip channel will lead to codes with rates be- 
yond what we have presented here. Focusing on reducing 
the amplitude error rate with an inner code while trying 
to avoid scrambling the phase errors more than necessary 
and following this up with a random stabilizer code (or 
perhaps a second similarly chosen code reversing the roles 
of amplitude and phase) offers an appealing heuristic for 
code design. Viewed in this way, the inner codes we have 
considered are quite primitive — a repetition code is the 
simplest code there is — and it seems likely more sophis- 
ticated codes will perform better. 

In summary, we have provided a toolset for studying 
degenerate codes on Pauli channels. We have demon- 
strated channels and codes for which the gap between the 
degenerate and nondegenerate performance is quite large 
compared to previous results, and improved the threshold 
for the more generally applicable depolarizing channel. 
Whether the capacity of the two-Pauli channel can be 
improved by degenerate codes remains an open question, 
the solution to which will likely prove illuminating. 